Skip to content

Move API docs redirects to CloudFront #366

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Aug 13, 2025

Conversation

matiasgarciaisaia
Copy link
Member

We've hit the limits of what S3 bucket redirections allow, so we're serving the 404 page with a 200 OK status code - breaking HTTP, and making [AI] crawlers go crazy coming up with paths.

We can now move that redirection logic to CloudFront, and be correct on our status codes.

We've hit the limits of what S3 bucket redirections allow, so we're
serving the 404 page with a `200 OK` status code - breaking HTTP, and
making [AI] crawlers go crazy coming up with paths.

We can now move that redirection logic to CloudFront, and be correct on
our status codes.
@matiasgarciaisaia
Copy link
Member Author

Deployment note: we still need to "apply" the function to the current CloudFront distribution for it to take effect. Also, we'll manually drop the old S3 bucket rules.

Comment on lines 5 to 9
if(requestPath.startsWith('/api/latest/')) {
redirectUri = requestPath.replace('/api/latest/', '/api/${CRYSTAL_VERSION}/')
} else if(requestPath.startsWith('/api/') && !(requestPath.startsWith('/api/0') || requestPath.startsWith('/api/1'))) {
redirectUri = requestPath.replace('/api/', '/api/${CRYSTAL_VERSION}/')
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue: I think we also need to pass through paths with the /api/master/ prefix (for the HEAD version of the docs) and /api/versions.json (metadata).

Copy link
Member

@straight-shoota straight-shoota Jul 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current s3 setup, we also have the following redirects which we might want to incorporate into cloudfront:

  • /api -> /api/ (302)
  • /api/ -> /api/latest (301)
  • /api/index.html -> /api/latest (301)
  • /api/latest to /api/latest/ (301)

Maybe they should all redirect directly to /api/${CRYSTAL_VERSION}/ without intermediaries and all be 302s? Or 301s to /api/latest/...

Then we could delete the files index.html and latest in the s3 bucket.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, they can all be 301 to /api/latest/ (trailing slash), which would avoid some redirects and make the move explicit 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oohhh! Good catch.

The S3 rules don't mention anything about /api/master/ and /api/versions.json, and I was scratching my head trying to understand why does this work different, then?. But that's because S3 rules apply when S3 resolves to a 404 error. This CloudFront function was meant (so far, at least) to be applied before hitting S3.

My first instinct is to keep this function as simple as possible: move it to after the request went to S3, and only work with 404's (as we're doing now with S3); and keep the few redirects that @straight-shoota mentions that are files in the bucket (ie, not S3 "logic").

But if you have a strong reason to move all the logic to this function, I can do that too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was also expecting this function to go before S3.
Because there's no point in bothering S3 when we can easily tell that the request wouldn't match anything.

But I'm not determined on whether that's actually better or not. I suppose most legitimate requests should not hit a 404 anyway. So maybe it's more efficient to have the rewrite only after S3.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. But the different versions are easily accessible from the version selector on each API doc page.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant an internal doc to avoid situations where it's only in someone's 🧠.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah, we probably should write that down somewhere. Although the CI workflows also document what's happening.

Btw. master is actually master, not nightly. It updates on every commit to the master branch, not just once per day.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From CloudFront - Restrictions on all edge functions:

CloudFront doesn't invoke edge functions for viewer response events when the origin returns HTTP status code 400 or higher.

🤦

So we can't say "send the request to S3, and we'll do our thing if that fails" 😞

Don't know why they put this limitation in place, but it changes the framing of the function. We'll need to apply the redirects before trying to hit the bucket.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Damn. Well, the rules are static and small. It's doable to hardcode them.

The S3 bucket had two empty files that were redirection-only:

- api/index.html --> /api/latest
- api/latest --> /api/latest/

We now add them to the function's check.
CloudFront requires to publish the new version, and we have to get their
ETag each time.
CloudFront doesn't run viewer response functions when the origin
responds a 4xx or 5xx, so we have to hard-code which requests can pass
straight to the origin.

This function will need to get updated if we add extra files to the
`/api/` directory, such as a 2.x release, or other new files - I predict
this will be the source of a few bugs in the future.
@matiasgarciaisaia
Copy link
Member Author

OK, I think the function is now working OK.

I've set https://next.crystal-lang.org/ as a CloudFront distribution matching the production one, and replaced its API docs origin with a clone of the production one - except it doesn't have the S3 redirection rules nor the "symlinks" files.

Feel free to test every URL you can think of, so we merge this and publish on production.

@ysbaddaden
Copy link

The API sounds good to me.

@matiasgarciaisaia matiasgarciaisaia merged commit 6d3d4c6 into master Aug 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants